Abstract: This paper presents an algorithm, the combined Schubert/secant/finite difference algorithm, for solving sparse nonlinear systems of equations. The algorithm divides the columns of the Jacobian into two parts and uses a different algorithm on each part. It incorporates the advantages of both algorithms by exploiting special structure of the Jacobian to obtain a good approximation to the Jacobian with as little effort as possible. A Kantorovich-type analysis and a local q-superlinear convergence result for this algorithm are given.


Key words: finite difference, Jacobian, q-superlinear convergence, Kantorovich-type analysis, sparsity, nonlinear system of equations.


1. Introduction.

Consider the nonlinear system of equations

$$F(x) = 0, \tag{1.1}$$

where $F: R^n \to R^n$ is continuously differentiable on an open convex set $D \subset R^n$, and the Jacobian matrix $F'(x)$ is sparse. To solve the system, we use the iteration

$$\bar{x} = x - B^{-1}F(x), \tag{1.2}$$

where $x$ is the current iterate, $\bar{x}$ is the new iterate, and $B$ is an approximation to $F'(x)$ that has the same sparsity as the Jacobian.

Suppose we have finished the current iteration. Then the information we have is $x$, $\bar{x}$, $F(x)$, $F(\bar{x})$, and $B$. The purpose of this paper is to find a matrix $\bar{B}$ that is a good approximation to $F'(\bar{x})$ while economizing on the number of function evaluations required for this approximation.

In 1970 Schubert [11] gave a sparse modification of Broyden’s [1] update. Broyden [2] also gave this algorithm independently. To present Schubert’s algorithm, we introduce the following notation concerning the sparsity pattern of the Jacobian.

Definition 1.1. For $j = 1, 2, \ldots, n$, define the subspace $Z_j \subset R^n$ determined by the sparsity pattern of the $j$th row of the Jacobian:

$$Z_j = \{v \in R^n : e_i^T v = 0 \text{ for all } i \text{ such that } [F'(x)]_{ji} = 0 \text{ for all } x \in R^n\},$$

where $e_i$ is the $i$th column of the $n \times n$ identity matrix. Define the set of matrices $Z$ that preserve the sparsity pattern of the Jacobian:

$$Z = \{A \in L(R^n) : A^T e_j \in Z_j \text{ for } j = 1, 2, \ldots, n\}.$$

Definition 1.2. For $j = 1, 2, \ldots, n$, define the projection operator $D_j \in L(R^n)$ that maps $R^n$ onto $Z_j$:

$$D_j = \mathrm{diag}(d_{j1}, d_{j2}, \ldots, d_{jn}),$$

where

$$d_{ji} = \begin{cases} 1, & \text{if } e_i \in Z_j, \\ 0, & \text{otherwise.} \end{cases}$$

For a scalar $a \in R$, define the pseudo-inverse

$$a^+ = \begin{cases} a^{-1}, & \text{if } a \neq 0, \\ 0, & \text{if } a = 0. \end{cases}$$


Now Schubert’s update is formulated as follows:

$$\bar{B} = B + \sum_{j=1}^{n} ([s]_j^T [s]_j)^+ e_j e_j^T (y - Bs)[s]_j^T, \tag{1.3}$$

where $[s]_j = D_j s$, $s = \bar{x} - x$, and $y = F(\bar{x}) - F(x)$.
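The row-by-row structure of (1.3) can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function name `schubert_update` and the boolean `pattern` argument (with `pattern[j][i]` true where entry $(j, i)$ of the Jacobian may be nonzero) are assumptions introduced here, and dense nested lists stand in for a sparse data structure.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def schubert_update(B: Matrix, pattern: List[List[bool]],
                    s: Vector, y: Vector) -> Matrix:
    """Apply the update (1.3) row by row and return the new matrix.

    pattern[j][i] marks the allowed nonzeros of row j (hypothetical
    representation of the sparsity pattern).
    """
    n = len(B)
    Bs = [sum(B[j][i] * s[i] for i in range(n)) for j in range(n)]
    B_new = [row[:] for row in B]
    for j in range(n):
        # [s]_j = D_j s: s with the entries outside row j's pattern zeroed.
        s_j = [s[i] if pattern[j][i] else 0.0 for i in range(n)]
        denom = sum(v * v for v in s_j)
        if denom == 0.0:
            continue  # pseudo-inverse of 0 is 0, so row j is unchanged
        r = (y[j] - Bs[j]) / denom
        for i in range(n):
            B_new[j][i] += r * s_j[i]
    return B_new
```

Each updated row satisfies its own secant condition $e_j^T \bar{B} s = e_j^T y$ whenever $[s]_j \neq 0$, and entries outside the pattern are never touched.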

The advantage of Schubert’s algorithm is that only one function value is required at each iteration, and it is q-superlinearly convergent (see Marwil [8]). However, it usually requires more iterations than finite difference algorithms (see Li [7]).

Curtis, Powell, and Reid [4] proposed a finite difference algorithm, called the CPR algorithm, which is based on a partition of the columns of the Jacobian. Coleman and Moré [3] associated the partition problem with a graph coloring problem and gave partitioning algorithms that make the number of function evaluations needed to approximate the Jacobian by the CPR algorithm optimal or nearly optimal.

Following Coleman and Moré, we give some definitions concerning a partition of the columns of the Jacobian.

Definition 1.3. A partition of the columns of a matrix $B$ is a division of the columns into groups $c_1, c_2, \ldots, c_p$ such that each column belongs to one and only one group.

Definition 1.4. A partition of the columns of a matrix $B$ is consistent with the direct determination of $B$ if whenever $b_{ij}$ is a nonzero element of $B$, the group containing column $j$ has no other column with a nonzero element in row $i$.
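Definitions 1.3 and 1.4 translate directly into a small check. The sketch below is illustrative only: `is_consistent` tests Definition 1.4 for a given grouping, and `greedy_partition` is a simple first-fit heuristic introduced here for demonstration. It is not one of the Coleman and Moré algorithms and makes no optimality claim. Both names, and the boolean `pattern[i][j]` representation, are assumptions.

```python
from typing import List

Pattern = List[List[bool]]


def is_consistent(pattern: Pattern, groups: List[List[int]]) -> bool:
    """True if no row has two nonzeros within the same column group."""
    for group in groups:
        for row in pattern:
            if sum(1 for j in group if row[j]) > 1:
                return False
    return True


def greedy_partition(pattern: Pattern) -> List[List[int]]:
    """First-fit grouping: put each column into the first group whose
    occupied rows do not intersect the column's nonzero rows."""
    n_rows, n_cols = len(pattern), len(pattern[0])
    groups: List[List[int]] = []
    used_rows: List[set] = []
    for j in range(n_cols):
        rows_j = {i for i in range(n_rows) if pattern[i][j]}
        for group, rows_g in zip(groups, used_rows):
            if not (rows_g & rows_j):
                group.append(j)
                rows_g |= rows_j  # in-place update of the group's row set
                break
        else:
            groups.append([j])
            used_rows.append(set(rows_j))
    return groups
```

For a tridiagonal pattern this heuristic produces three groups, which is why a tridiagonal Jacobian can be estimated with three differences regardless of $n$.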

The CPR algorithm can be formulated as follows: for a given consistent partition of the columns of the Jacobian, which divides the set $\{1, \ldots, n\}$ into $p$ subsets $c_1, \ldots, c_p$ (for convenience, $c_i$, $i = 1, 2, \ldots, p$, indicates both the sets of columns and the sets of indices of these columns), obtain vectors $d_1, d_2, \ldots, d_p$ such that $B$ is determined uniquely by the equations

$$B d_i = F(x + d_i) - F(x) \equiv y_i, \quad i = 1, 2, \ldots, p. \tag{1.4}$$
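A hedged sketch of how (1.4) determines $B$: with a consistent partition, each row of $y_i$ receives a contribution from at most one column of group $c_i$, so every nonzero entry can be read off by a single division. The function name `cpr_jacobian`, the uniform step `h`, and the choice $d_i = h \sum_{j \in c_i} e_j$ are assumptions made for illustration; the source does not prescribe how the $d_i$ are scaled.

```python
from typing import Callable, List

Vector = List[float]


def cpr_jacobian(F: Callable[[Vector], Vector], x: Vector,
                 pattern: List[List[bool]], groups: List[List[int]],
                 h: float = 1e-6) -> List[List[float]]:
    """Estimate a sparse Jacobian with one difference per column group.

    Uses 1 + len(groups) evaluations of F: one at x and one per group.
    """
    n = len(x)
    Fx = F(x)
    B = [[0.0] * n for _ in range(n)]
    for group in groups:
        xd = x[:]
        for j in group:
            xd[j] += h  # d_i has the step h in every column of the group
        y = [a - b for a, b in zip(F(xd), Fx)]
        for j in group:
            for i in range(n):
                if pattern[i][j]:
                    # consistency: column j is the only one in the group
                    # that can contribute to row i
                    B[i][j] = y[i] / h
    return B
```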

Notice that for the CPR algorithm, the number of function evaluations at each iteration is $p + 1$. Since the partition of the columns of the Jacobian plays an important role in the CPR algorithm, we call the CPR algorithm based on Coleman and Moré’s algorithms the CPR-CM algorithm.

The advantage of the CPR algorithm is that it usually requires fewer iterations than Schubert’s algorithm. However, it requires more function values at each iteration than Schubert’s algorithm (see Li [7]).

In an earlier paper [7], we proposed an algorithm called the secant/finite difference (SFD) algorithm, which is also based on a consistent partition of the columns of the Jacobian. However, it uses the information already available at every iterative step more efficiently than the CPR algorithm. Let

$$d_i = \sum_{j \in c_i} s_j e_j, \quad i = 1, 2, \ldots, p, \tag{1.5}$$

$$g_i = \sum_{j=1}^{i} d_j, \quad g_0 = 0, \tag{1.6}$$

and

$$y_i = F(\bar{x} - g_{i-1}) - F(\bar{x} - g_i), \quad i = 1, 2, \ldots, p, \tag{1.7}$$

where $s_i = \bar{x}_i - x_i$ indicates the $i$th component of $s$. The SFD algorithm can be formulated as follows: if $s_j \neq 0$ for some $j \in c_i$, then the $j$th column of $\bar{B}$ is determined uniquely by the equations

$$\bar{B} d_i = y_i.$$

If $s_j = 0$, then the $j$th column of $\bar{B}$ is equal to the $j$th column of $B$.

Since

$$\begin{aligned} y_1 &= F(\bar{x} - g_0) - F(\bar{x} - g_1) = F(\bar{x}) - F(\bar{x} - g_1), \\ y_p &= F(\bar{x} - g_{p-1}) - F(\bar{x} - g_p) = F(\bar{x} - g_{p-1}) - F(x), \end{aligned} \tag{1.8}$$

the number of function evaluations required by the SFD algorithm at each iteration is one less than that required by the CPR-CM algorithm.
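The saving of one function value can be seen in a short sketch of the SFD column update. This is an illustration under stated assumptions, not the implementation of [7]: the name `sfd_update`, the argument layout, and the dense nested-list storage are introduced here. The points $\bar{x} - g_i$ are built by undoing the step one column group at a time, and the last point $\bar{x} - g_p = x$ reuses the known value $F(x)$.

```python
from typing import Callable, List

Vector = List[float]
Matrix = List[List[float]]


def sfd_update(F: Callable[[Vector], Vector], B: Matrix,
               x: Vector, x_new: Vector, F_x: Vector, F_x_new: Vector,
               pattern: List[List[bool]],
               groups: List[List[int]]) -> Matrix:
    """Return the SFD update of B; calls F only len(groups) - 1 times."""
    n = len(x)
    s = [b - a for a, b in zip(x, x_new)]
    B_new = [row[:] for row in B]
    point, F_prev = x_new[:], F_x_new          # x_bar - g_0 = x_bar
    for k, group in enumerate(groups):
        for j in group:
            point[j] -= s[j]                   # point is now x_bar - g_i
        is_last = (k == len(groups) - 1)
        F_cur = F_x if is_last else F(point)   # x_bar - g_p = x is known
        y = [a - b for a, b in zip(F_prev, F_cur)]
        for j in group:
            if s[j] != 0.0:                    # otherwise keep old column
                for i in range(n):
                    if pattern[i][j]:
                        B_new[i][j] = y[i] / s[j]
        F_prev = F_cur
    return B_new
```

Counting $F(\bar{x})$, which the next iteration needs in any case, this uses $p$ function values per iteration, against $p + 1$ for the CPR scheme.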

Now consider the example (1.9), for which [...] 5,6,7} is an optimal consistent partition of the columns of the Jacobian. For this problem, the CPR-CM algorithm and the SFD algorithm require 5 and 4 function values at each iteration, respectively.


In this paper, we propose an algorithm called the combined Schubert/secant/finite difference (CSSFD) algorithm, which is a combination of the SFD algorithm and Schubert’s algorithm (including Broyden’s algorithm). For some problems, this algorithm can reduce the number of function values required at each iteration below that of the SFD algorithm by considering special structure of the Jacobian. For example (1.9), the number of function evaluations is 2.

The CSSFD algorithm and its properties are given in Section 2. A Kantorovich-type analysis for this algorithm is given in Section 3. A q-superlinear convergence result is given in Section 4.

In this paper, $L(R^n)$ denotes the linear space of all real $n \times n$ matrices, $\|\cdot\|_F$ indicates the Frobenius norm of a matrix, and $\|\cdot\|$ indicates the $l_2$ vector norm.

2. The CSSFD Algorithm and its Properties.

Consider example (1.9). The first 3 columns of the matrix are denser than the other columns, and this makes $p$, the number of groups in the partition, at least 4. The CSSFD algorithm divides the columns of the Jacobian into two parts and uses a different algorithm on each part.

We say that a group of columns of a matrix has 'good sparsity' if the columns in this group have few nonzeros in the same row position. Otherwise, we say that the group of columns has 'bad sparsity'.

Suppose the columns of the Jacobian can be divided into two groups: the good sparsity group $c$ and the bad sparsity group $c_1$. For convenience, we use $c$ and $c_1$ to indicate both the groups of columns of a matrix and the sets of indices of these columns. Then

$$c \cup c_1 = \{1, \ldots, n\}.$$

For any matrix $A \in L(R^n)$, let

$$A_1 = A \sum_{j \in c_1} e_j e_j^T, \qquad A_2 = A \sum_{j \in c} e_j e_j^T.$$

Then $A = A_1 + A_2$. The main idea of the CSSFD algorithm is to use Schubert’s update (including Broyden’s update) on $B_1$ and to use the SFD algorithm on $B_2$, where $B = B_1 + B_2$.

In practice, there are many ways to choose $c$ and $c_1$. For example, we can first partition the columns by using a CPR-CM procedure. Then, if we can afford $m$ $F$-values at each iteration, we can keep the columns of the $m - 1$ largest groups of the partition for $c$ and put all the remaining columns into $c_1$.
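The selection rule just described (keep the $m - 1$ largest groups for $c$, send the rest to $c_1$) might look as follows. The helper name `split_columns` and its return convention are assumptions for illustration; ties between equally sized groups are broken by their original order, which the source does not specify.

```python
from typing import List, Tuple


def split_columns(groups: List[List[int]],
                  m: int) -> Tuple[List[List[int]], List[int]]:
    """Split a column partition for a budget of m function values.

    Returns (sfd_groups, c1): the m - 1 largest groups, which form c and
    are handled by the SFD update, and the sorted indices of all other
    columns, which form c_1 and are handled by Schubert's update.
    """
    keep = max(m - 1, 0)
    ordered = sorted(groups, key=len, reverse=True)  # stable sort
    sfd_groups = ordered[:keep]
    c1 = sorted(j for group in ordered[keep:] for j in group)
    return sfd_groups, c1
```

With a budget of $m = 3$ and groups of sizes 1, 3, 2, 1, the two largest groups stay in $c$ and the two singleton groups are merged into $c_1$.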

Algorithm 2.1. Given a consistent partition of $B_2$, which divides $c$ into $p - 1$ subsets $c_2, c_3, \ldots, c_p$, and given an $x^0 \in R^n$ and a nonsingular matrix $B_0$ with the same sparsity as the Jacobian, at each step $k \geq 0$:

(1) Solve $B_k \hat{s}^k = -F(x^k)$.

(2) Choose $x^{k+1}$ by $x^{k+1} = x^k + \hat{s}^k$ or by a global strategy such as a trust-region method. Let $s^k = x^{k+1} - x^k$.

(3) Check for convergence.

(4) Update $B_{k,1}$ by Schubert’s update to get $B_{k+1,1}$, and update $B_{k,2}$ by the SFD algorithm to get $B_{k+1,2}$.

(5) Set

$$B_{k+1} = B_{k+1,1} + B_{k+1,2}.$$

Let

$$d_i = \sum_{j \in c_i} s_j e_j, \quad i = 1, \ldots, p, \tag{2.1}$$

$$g_i = \sum_{j=1}^{i} d_j, \quad i = 1, \ldots, p, \quad g_0 = 0, \tag{2.2}$$

$$y_i = F(\bar{x} - g_{i-1}) - F(\bar{x} - g_i), \quad i = 1, 2, \ldots, p, \tag{2.3}$$

and

$$J_i = \int_0^1 F'(\bar{x} - g_i + t(g_i - g_{i-1}))\,dt, \quad i = 1, 2, \ldots, p. \tag{2.4}$$

Then,

$$J_i d_i = y_i, \quad i = 1, 2, \ldots, p,$$

and the update of Algorithm 2.1 can be formulated as

$$\begin{aligned} \bar{B}_1 &= B_1 + \sum_{i=1}^{n} ([d_1]_i^T [d_1]_i)^+ e_i e_i^T (y_1 - B_1 d_1)[d_1]_i^T, \\ \bar{B}_2 &= B_2 + \sum_{i=2}^{p} \sum_{j \in c_i} s_j^+ s_j (J_i - B_2) e_j e_j^T, \\ \bar{B} &= \bar{B}_1 + \bar{B}_2. \end{aligned} \tag{2.5}$$

Now we give some of the properties of $\bar{B}$ obtained from (2.5).

Lemma 2.1. $\bar{B}$ satisfies the secant equations

$$\bar{B} d_i = y_i, \quad i = 1, \ldots, p, \tag{2.6}$$

and (2.6) implies that

$$\bar{B} s = F(\bar{x}) - F(x) = y. \tag{2.7}$$

Lemma 2.2. $\bar{B}$ is the unique solution to

$$\min\{\|\hat{B} - B\|_F : \hat{B} d_i = y_i,\ i = 1, \ldots, p, \text{ and } \hat{B} \in Z\}. \tag{2.8}$$

The proof of this lemma is similar to that for Schubert’s algorithm given by Reid [10] and Marwil [8].
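To make the combined update concrete, the sketch below applies a Schubert-type correction to the columns in $c_1$ and the SFD rule to the groups $c_2, \ldots, c_p$, so that the secant equation of Lemma 2.1 can be checked numerically. It is a minimal illustration, not the author's code: the name `cssfd_update`, the argument layout, and the dense storage are assumptions, and `B` is assumed to be zero outside `pattern`.

```python
from typing import Callable, List

Vector = List[float]
Matrix = List[List[float]]


def cssfd_update(F: Callable[[Vector], Vector], B: Matrix,
                 x: Vector, x_new: Vector, F_x: Vector, F_x_new: Vector,
                 pattern: List[List[bool]], c1: List[int],
                 sfd_groups: List[List[int]]) -> Matrix:
    """One combined update; calls F len(sfd_groups) times when that
    list is nonempty, and not at all otherwise."""
    n = len(x)
    s = [b - a for a, b in zip(x, x_new)]
    B_new = [row[:] for row in B]
    point = x_new[:]                            # x_bar - g_0
    for j in c1:
        point[j] -= s[j]                        # x_bar - g_1 = x_bar - d_1
    F_cur = F(point) if sfd_groups else F_x
    y1 = [a - b for a, b in zip(F_x_new, F_cur)]
    # Schubert part: row i is corrected along [d_1]_i, the part of d_1
    # lying in row i's sparsity pattern.
    for i in range(n):
        cols = [j for j in c1 if pattern[i][j]]
        denom = sum(s[j] * s[j] for j in cols)
        if denom == 0.0:
            continue                            # pseudo-inverse of 0 is 0
        resid = y1[i] - sum(B[i][j] * s[j] for j in cols)
        for j in cols:
            B_new[i][j] += resid * s[j] / denom
    # SFD part: columns with s_j != 0 are replaced by difference quotients.
    F_prev = F_cur
    for k, group in enumerate(sfd_groups):
        for j in group:
            point[j] -= s[j]                    # x_bar - g_i
        is_last = (k == len(sfd_groups) - 1)
        F_cur = F_x if is_last else F(point)    # last point equals x
        y = [a - b for a, b in zip(F_prev, F_cur)]
        for j in group:
            if s[j] != 0.0:
                for i in range(n):
                    if pattern[i][j]:
                        B_new[i][j] = y[i] / s[j]
        F_prev = F_cur
    return B_new
```

Because the differences $y_i$ telescope, the returned matrix satisfies $\bar{B} s = F(\bar{x}) - F(x)$ up to rounding, which is the content of (2.7).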

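Theorem 2.3. If $A \in L(R^n)$ has the same sparsity as the Jacobian, then

$$\|\bar{B}_1 - A_1\|_F^2 \leq \|B_1 - A_1\|_F^2 - \frac{1}{\|s\|^2}\|(B_1 - A_1)s\|^2 + \sum_{i=1}^{n} ([d_1]_i^T [d_1]_i)^+ [e_i^T (y_1 - A_1 d_1)]^2. \tag{2.9}$$

Proof. Let $E_1 = B_1 - A_1$ and $\bar{E}_1 = \bar{B}_1 - A_1$. From (2.5), we have

$$e_i^T \bar{B}_1 = e_i^T B_1 + ([d_1]_i^T [d_1]_i)^+ e_i^T (y_1 - B_1 d_1)[d_1]_i^T. \tag{2.10}$$

Subtracting $e_i^T A_1$ from both sides of (2.10), and noticing that $e_i^T B_1 d_1 = e_i^T B_1 [d_1]_i$ and that $e_i^T A_1 d_1 = e_i^T A_1 [d_1]_i$, we obtain

$$\begin{aligned} e_i^T \bar{E}_1 &= e_i^T E_1 + ([d_1]_i^T [d_1]_i)^+ e_i^T (y_1 - B_1 d_1)[d_1]_i^T \\ &= e_i^T E_1 \left(I - ([d_1]_i^T [d_1]_i)^+ [d_1]_i [d_1]_i^T\right) + ([d_1]_i^T [d_1]_i)^+ e_i^T (y_1 - A_1 d_1)[d_1]_i^T. \end{aligned} \tag{2.11}$$

Since $([d_1]_i^T [d_1]_i)^+ e_i^T (y_1 - A_1 d_1)$ is a scalar, the first and second terms on the right of (2.11) are perpendicular to each other, and we have

$$\begin{aligned} \|e_i^T \bar{E}_1\|^2 &= \left\|e_i^T E_1 \left(I - ([d_1]_i^T [d_1]_i)^+ [d_1]_i [d_1]_i^T\right)\right\|^2 + ([d_1]_i^T [d_1]_i)^+ |e_i^T (y_1 - A_1 d_1)|^2 \\ &= \|e_i^T E_1\|^2 - ([d_1]_i^T [d_1]_i)^+ |e_i^T E_1 [d_1]_i|^2 + ([d_1]_i^T [d_1]_i)^+ |e_i^T (y_1 - A_1 d_1)|^2 \\ &\leq \|e_i^T E_1\|^2 - \frac{|e_i^T E_1 d_1|^2}{\|s\|^2} + ([d_1]_i^T [d_1]_i)^+ |e_i^T (y_1 - A_1 d_1)|^2. \end{aligned}$$

Therefore,

$$\begin{aligned} \|\bar{B}_1 - A_1\|_F^2 &= \sum_{i=1}^{n} \|e_i^T \bar{E}_1\|^2 \\ &\leq \sum_{i=1}^{n} \left(\|e_i^T E_1\|^2 - \frac{|e_i^T E_1 d_1|^2}{\|s\|^2}\right) + \sum_{i=1}^{n} ([d_1]_i^T [d_1]_i)^+ [e_i^T (y_1 - A_1 d_1)]^2 \\ &= \|B_1 - A_1\|_F^2 - \frac{1}{\|s\|^2}\|(B_1 - A_1)s\|^2 + \sum_{i=1}^{n} ([d_1]_i^T [d_1]_i)^+ [e_i^T (y_1 - A_1 d_1)]^2. \end{aligned}$$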


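Theorem 2.4. If $A \in L(R^n)$ has the same sparsity as the Jacobian, then

$$\|\bar{B}_2 - A_2\|_F^2 \leq \|B_2 - A_2\|_F^2 - \frac{1}{\|s\|^2}\|(B_2 - A_2)s\|^2 + \sum_{i=2}^{p} \sum_{j \in c_i} s_j^+ s_j \|(J_i - A_2)e_j\|^2. \tag{2.12}$$

Proof. Let $E_2 = B_2 - A_2$ and $\bar{E}_2 = \bar{B}_2 - A_2$. It follows from (2.5) that if $j \in c_i$, $i = 2, \ldots, p$, then

$$\bar{B}_2 e_j = B_2 e_j + s_j^+ s_j (J_i - B_2)e_j. \tag{2.13}$$

Subtracting $A_2 e_j$ from both sides of (2.13), we obtain

$$\bar{E}_2 e_j = (1 - s_j^+ s_j)E_2 e_j + s_j^+ s_j (J_i - A_2)e_j.$$

Since $(1 - s_j^+ s_j)s_j^+ s_j = 0$, we have

$$\begin{aligned} \|\bar{E}_2 e_j\|^2 &= (1 - s_j^+ s_j)\|E_2 e_j\|^2 + s_j^+ s_j \|(J_i - A_2)e_j\|^2 \\ &= \|E_2 e_j\|^2 - s_j^+ s_j \|E_2 e_j\|^2 + s_j^+ s_j \|(J_i - A_2)e_j\|^2. \end{aligned}$$

Therefore,

$$\begin{aligned} \|\bar{E}_2\|_F^2 &= \sum_{j \in c} \|\bar{E}_2 e_j\|^2 \\ &= \|E_2\|_F^2 - \sum_{j \in c} s_j^+ s_j \|E_2 e_j\|^2 + \sum_{i=2}^{p} \sum_{j \in c_i} s_j^+ s_j \|(J_i - A_2)e_j\|^2. \end{aligned} \tag{2.14}$$

In addition,

$$\sum_{j \in c} s_j^+ s_j \|E_2 e_j\|^2 = \Big\|E_2 \sum_{j \in c} s_j^+ s_j e_j e_j^T\Big\|_F^2 \geq \frac{\big\|E_2 \sum_{j \in c} s_j^+ s_j e_j e_j^T s\big\|^2}{\|s\|^2} = \frac{\|E_2 s\|^2}{\|s\|^2}.$$

Thus, (2.12) follows from (2.14).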


3. A Kantorovich-type Analysis.

To study the convergence properties of Algorithm 2.1, we assume that $F'$ satisfies the following Lipschitz condition: for every $i \in c$, there exists $\gamma_i > 0$ such that

$$\|(F'(x) - F'(y))e_i\| \leq \gamma_i \|x - y\|, \quad \forall x, y \in D, \tag{3.1}$$

and there exist $\theta_i > 0$, $i = 1, 2, \ldots, n$, such that

$$\|e_i^T (F'(x)_1 - F'(y)_1)\| \leq \theta_i \|x - y\|, \quad \forall x, y \in D. \tag{3.2}$$

Let $\gamma = (\sum_{i \in c} \gamma_i^2)^{1/2}$, $\theta = (\sum_{i=1}^{n} \theta_i^2)^{1/2}$, and $\alpha = (\gamma^2 + \theta^2)^{1/2}$. If $F'$ satisfies this Lipschitz condition, then the following are true:

$$\|F'(x)_1 - F'(y)_1\|_F \leq \theta \|x - y\|, \quad \forall x, y \in D, \tag{3.3}$$

$$\|F'(x)_2 - F'(y)_2\|_F \leq \gamma \|x - y\|, \quad \forall x, y \in D, \tag{3.4}$$

and

$$\|F'(x) - F'(y)\|_F \leq \alpha \|x - y\|, \quad \forall x, y \in D. \tag{3.5}$$

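Lemma 3.1. Let $F'$ satisfy (3.1) and (3.2), and let $\bar{B}$ be generated by Algorithm 2.1. If $\bar{x} \in D$ and $\bar{x} - d_1 \in D$, then for any $z \in D$,

$$\|\bar{B}_1 - F'(z)_1\|_F^2 \leq \|B_1 - F'(z)_1\|_F^2 - \frac{1}{\|s\|^2}\|(B_1 - F'(z)_1)s\|^2 + \theta^2 \left(\|\bar{x} - z\| + \tfrac{1}{2}\|d_1\|\right)^2. \tag{3.6}$$

Proof. Substituting $F'(z)$ for $A$ in (2.9), we obtain

$$\|\bar{B}_1 - F'(z)_1\|_F^2 \leq \|B_1 - F'(z)_1\|_F^2 - \frac{1}{\|s\|^2}\|(B_1 - F'(z)_1)s\|^2 + \sum_{i=1}^{n} ([d_1]_i^T [d_1]_i)^+ [e_i^T (y_1 - F'(z)d_1)]^2. \tag{3.7}$$

By (2.3), (2.4), (3.3), and the Cauchy-Schwarz inequality, we have

$$\begin{aligned} \sum_{i=1}^{n} ([d_1]_i^T [d_1]_i)^+ [e_i^T (y_1 - F'(z)d_1)]^2 &= \sum_{i=1}^{n} ([d_1]_i^T [d_1]_i)^+ \left(e_i^T (J_1 - F'(z))_1 [d_1]_i\right)^2 \\ &\leq \sum_{i=1}^{n} ([d_1]_i^T [d_1]_i)^+ \|e_i^T (J_1 - F'(z))_1\|^2 \|[d_1]_i\|^2 \\ &\leq \sum_{i=1}^{n} \|e_i^T (J_1 - F'(z))_1\|^2 = \|(J_1 - F'(z))_1\|_F^2 \\ &= \Big\|\int_0^1 (F'(\bar{x} - (1 - t)d_1) - F'(z))_1\,dt\Big\|_F^2 \\ &\leq \theta^2 \left(\|\bar{x} - z\| + \tfrac{1}{2}\|d_1\|\right)^2. \end{aligned} \tag{3.8}$$

Then (3.6) follows from (3.7) and (3.8).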


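Lemma 3.2. Let $F'$ satisfy (3.1) and (3.2), and let $\bar{B}$ be generated by Algorithm 2.1. If $\bar{x} \in D$ and $\{\bar{x} - g_i,\ i = 2, \ldots, p\} \subset D$, then for any $z \in D$,

$$\|\bar{B}_2 - F'(z)_2\|_F^2 \leq \|B_2 - F'(z)_2\|_F^2 - \frac{1}{\|s\|^2}\|(B_2 - F'(z)_2)s\|^2 + \gamma^2 (\|\bar{x} - z\| + \|s\|)^2. \tag{3.9}$$

Proof. Substituting $F'(z)$ for $A$ in (2.12), we obtain

$$\|\bar{B}_2 - F'(z)_2\|_F^2 \leq \|B_2 - F'(z)_2\|_F^2 - \frac{1}{\|s\|^2}\|(B_2 - F'(z)_2)s\|^2 + \sum_{i=2}^{p} \sum_{j \in c_i} s_j^+ s_j \|(J_i - F'(z))e_j\|^2. \tag{3.10}$$

It follows from (2.3) and (3.1) that

$$\begin{aligned} \sum_{i=2}^{p} \sum_{j \in c_i} s_j^+ s_j \|(J_i - F'(z))e_j\|^2 &\leq \sum_{i=2}^{p} \sum_{j \in c_i} \|(J_i - F'(z))e_j\|^2 \\ &= \sum_{i=2}^{p} \sum_{j \in c_i} \Big\|\int_0^1 (F'(\bar{x} - g_i + t(g_i - g_{i-1})) - F'(z))\,dt\; e_j\Big\|^2 \\ &\leq \sum_{i=2}^{p} \sum_{j \in c_i} \gamma_j^2 \Big(\int_0^1 (\|\bar{x} - z\| + (1 - t)\|g_i\| + t\|g_{i-1}\|)\,dt\Big)^2 \\ &\leq \sum_{i=2}^{p} \sum_{j \in c_i} \gamma_j^2 (\|\bar{x} - z\| + \|s\|)^2 = \gamma^2 (\|\bar{x} - z\| + \|s\|)^2. \end{aligned} \tag{3.11}$$

Thus, (3.9) follows from (3.10) and (3.11).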

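Let

$$d_i^k = \sum_{j \in c_i} s_j^k e_j$$

and

$$g_i^k = \sum_{j=1}^{i} d_j^k, \quad i = 1, 2, \ldots, p, \quad g_0^k = 0.$$

We have the following estimate for $B_{k+1}$.

Theorem 3.3. Let $F'$ satisfy (3.1) and (3.2), and let $\{x^k\}$ and $\{B_k\}$ be generated by Algorithm 2.1. If $\{x^j\}_{j=0}^{k+1} \subset D$ and $\{x^{j+1} - g_i^j,\ i = 1, 2, \ldots, p\}_{j=0}^{k} \subset D$, then

$$\|B_{k+1} - F'(x^{k+1})\|_F \leq \|B_0 - F'(x^0)\|_F + 2\alpha \sum_{i=0}^{k} \|x^{i+1} - x^i\|. \tag{3.12}$$

Proof. Substituting $\bar{x}$ for $z$ in (3.6) and (3.9), we have

$$\|\bar{B}_1 - F'(\bar{x})_1\|_F^2 \leq \|B_1 - F'(\bar{x})_1\|_F^2 + \left(\tfrac{\theta}{2}\|d_1\|\right)^2$$

and

$$\|\bar{B}_2 - F'(\bar{x})_2\|_F^2 \leq \|B_2 - F'(\bar{x})_2\|_F^2 + (\gamma \|s\|)^2.$$

Therefore

$$\begin{aligned} \|\bar{B} - F'(\bar{x})\|_F^2 &= \|\bar{B}_1 - F'(\bar{x})_1\|_F^2 + \|\bar{B}_2 - F'(\bar{x})_2\|_F^2 \\ &\leq \|B - F'(\bar{x})\|_F^2 + (\theta^2 + \gamma^2)\|s\|^2 \\ &= \|B - F'(\bar{x})\|_F^2 + \alpha^2 \|s\|^2. \end{aligned}$$

Then

$$\begin{aligned} \|\bar{B} - F'(\bar{x})\|_F &\leq \|B - F'(\bar{x})\|_F + \alpha \|\bar{x} - x\| \\ &\leq \|B - F'(x)\|_F + 2\alpha \|\bar{x} - x\|. \end{aligned} \tag{3.13}$$

Thus, (3.12) follows from (3.13).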

From (3.12), we have the following Kantorovich-type theorem for Algorithm 2.1.


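Theorem 3.4. Assume that $F'$ satisfies (3.1) and (3.2). Also assume that $x^0 \in D$ and $B_0 \in L(R^n)$ satisfy

$$\|B_0 - F'(x^0)\|_F \leq \delta, \quad \|B_0^{-1}\|_F \leq \beta, \quad \|B_0^{-1}F(x^0)\| \leq \eta,$$

and

$$h \equiv \frac{\alpha\beta\eta}{(1 - 3\beta\delta)^2} \leq \frac{1}{10}, \quad \beta\delta < \frac{1}{3}.$$

If $S(x^0, 2t^*) = \{x : \|x - x^0\| \leq 2t^*\} \subset D$, where

$$t^* = \frac{1 - 3\beta\delta}{5\alpha\beta}\left(1 - (1 - 10h)^{1/2}\right),$$

then $\{x^k\}$, generated by Algorithm 2.1 without any global strategy, converges to $x^*$, which is the unique root of $F(x)$ in $S(x^0, \tilde{t}) \cap D$, where

$$\tilde{t} = \frac{1 - \beta\delta}{\alpha\beta}\left[1 + \left(1 - \frac{2\alpha\beta\eta}{(1 - \beta\delta)^2}\right)^{1/2}\right].$$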


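Proof. Consider the scalar iteration

$$t_{k+1} - t_k = \frac{2\beta}{2 - \beta\delta} f(t_k), \quad t_0 = 0, \quad k = 0, 1, 2, \ldots, \tag{3.14}$$

where

$$f(t) = \frac{5}{2}\alpha t^2 - \frac{1 - 3\beta\delta}{\beta} t + \frac{\eta}{\beta}. \tag{3.15}$$

It is easy to show that $\{t_k\}$ satisfies the difference equation

$$t_{k+1} - t_k = \frac{\beta}{1 - \varphi}\left[\alpha(t_k - t_{k-1}) + 2\alpha t_{k-1} + \delta\right](t_k - t_{k-1}), \tag{3.16}$$

where $\varphi = \frac{3 + \beta\delta}{5} < \frac{2}{3}$. From (3.16), we see that $\{t_k\}$ is a monotonically increasing sequence and that

$$\lim_{k \to \infty} t_k = t^*,$$

where $t^*$ is the smallest root of (3.15).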
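A numerical check of the majorizing sequence can make its behavior tangible. The sketch below assumes the scalar iteration has the form $t_{k+1} = t_k + \frac{2\beta}{2 - \beta\delta} f(t_k)$ with $f(t) = \frac{5}{2}\alpha t^2 - \frac{1 - 3\beta\delta}{\beta} t + \frac{\eta}{\beta}$. That is one reading of the source which is consistent with the difference equation (3.16) and with $t^*$ being the smallest root of $f$. The function name `majorizing_sequence` and the sample constants are hypothetical.

```python
import math
from typing import List, Tuple


def majorizing_sequence(alpha: float, beta: float, delta: float,
                        eta: float, steps: int = 50) -> Tuple[float, List[float]]:
    """Return (t_star, [t_0, ..., t_steps]) for the assumed iteration."""
    h = alpha * beta * eta / (1.0 - 3.0 * beta * delta) ** 2
    if not (h <= 0.1 and beta * delta < 1.0 / 3.0):
        raise ValueError("hypotheses h <= 1/10 and beta*delta < 1/3 fail")
    # smallest root of f, as in the statement of Theorem 3.4
    t_star = ((1.0 - 3.0 * beta * delta) / (5.0 * alpha * beta)
              * (1.0 - math.sqrt(1.0 - 10.0 * h)))

    def f(t: float) -> float:
        return (2.5 * alpha * t * t
                - (1.0 - 3.0 * beta * delta) / beta * t
                + eta / beta)

    c = 2.0 * beta / (2.0 - beta * delta)   # assumed step factor
    ts = [0.0]
    for _ in range(steps):
        ts.append(ts[-1] + c * f(ts[-1]))
    return t_star, ts
```

For instance, with $\alpha = \beta = 1$, $\delta = 0.1$, and $\eta = 0.04$ the hypotheses hold, and the computed $t_k$ increase monotonically toward $t^*$.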

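Now, by induction, we will prove that

$$\|x^{k+1} - x^k\| \leq t_{k+1} - t_k, \quad k = 0, 1, 2, \ldots, \tag{3.17}$$

$$\{x^k\} \subset S(x^0, t^*), \tag{3.18}$$

$$\{x^{k+1} - g_i^k,\ i = 1, 2, \ldots, p\} \subset S(x^0, 2t^*), \tag{3.19}$$

and

$$\|B_k^{-1}\|_F \leq \frac{\beta}{1 - \varphi}, \quad k = 0, 1, 2, \ldots. \tag{3.20}$$

For $k = 0$, we have

$$\|x^1 - x^0\| \leq \eta \leq \frac{2}{2 - \beta\delta}\,\eta = t_1 - t_0 \leq t^*.$$

Thus,

$$\|x^1 - g_i^0 - x^0\| \leq \|x^1 - x^0\| + \|g_i^0\| \leq 2\|x^1 - x^0\| \leq 2t^*.$$

Suppose (3.17) holds for $k = 0, 1, \ldots, m - 1$. Then,

$$\|x^m - x^0\| \leq \sum_{i=0}^{m-1} (t_{i+1} - t_i) = t_m \leq t^*.$$

Therefore, $x^m \in S(x^0, t^*)$, and

$$\{x^m - g_i^{m-1},\ i = 1, \ldots, p\} \subset S(x^0, 2t^*).$$

By Theorem 3.3,

$$\begin{aligned} \|B_0^{-1}(B_m - B_0)\| &\leq \|B_0^{-1}\|_F \left(\|B_m - F'(x^m)\|_F + \|F'(x^m) - F'(x^0)\|_F + \|F'(x^0) - B_0\|_F\right) \\ &\leq \beta\Big(3\alpha \sum_{i=0}^{m-1} \|x^{i+1} - x^i\| + 2\delta\Big) \leq \beta(3\alpha t^* + 2\delta) \leq \frac{3 + \beta\delta}{5} = \varphi. \end{aligned}$$

Thus, by Theorem 3.1.4 of Dennis and Schnabel [6, p. 45],

$$\|B_m^{-1}\|_F \leq \frac{\beta}{1 - \varphi} \leq 3\beta.$$

Therefore,

$$\begin{aligned} \|x^{m+1} - x^m\| &\leq \|B_m^{-1}\|_F \|F(x^m) - F(x^{m-1}) - B_{m-1}(x^m - x^{m-1})\| \\ &\leq \frac{\beta}{1 - \varphi}\Big[\frac{\alpha}{2}\|x^m - x^{m-1}\| + 2\alpha \sum_{i=0}^{m-2} \|x^{i+1} - x^i\| + \delta\Big]\|x^m - x^{m-1}\| \\ &\leq \frac{\beta}{1 - \varphi}\left[\alpha(t_m - t_{m-1}) + 2\alpha t_{m-1} + \delta\right](t_m - t_{m-1}) = t_{m+1} - t_m. \end{aligned}$$

This completes the induction step. By (3.17), it is easy to show that there is an $x^* \in D$ such that

$$\lim_{k \to \infty} x^k = x^*.$$

The uniqueness of $x^*$ in $S(x^0, \tilde{t}) \cap D$ can be obtained from Theorem 12.6.4 of [9] by setting $A(x) = B_0$.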


4. Local Convergence Properties.

To study the local convergence of our algorithm, we assume that $F: D \subset R^n \to R^n$ has the following property:

There is an $x^* \in D$ such that $F(x^*) = 0$ and $F'(x^*)$ is nonsingular. (4.1)

Theorem 4.1. Let $F$ satisfy (4.1), and let $F'$ satisfy (3.1) and (3.2). Also, let $\{x^k\}$ be generated by Algorithm 2.1 without any global strategy. Then there exist $\epsilon, \delta > 0$ such that if $x^0 \in D$ and $B_0$, a nonsingular $n \times n$ matrix, satisfy

$$\|x^0 - x^*\| < \epsilon, \quad \|B_0 - F'(x^*)\|_F < \delta,$$

then $\{x^k\}$ is well defined and converges q-superlinearly to $x^*$.

Proof. Notice that when $\epsilon$ and $\delta$ are small enough, we have that $h \leq \frac{1}{10}$, $\beta\delta < \frac{1}{3}$, and that $S(x^0, 2t^*) \subset D$, where $h$ and $t^*$ are defined in Theorem 3.4. Therefore, by Theorem 3.4,

$$\{x^{k+1} - g_i^k,\ i = 1, \ldots, p\} \subset D.$$

Thus, substituting $x^*$ for $z$ in (3.6) and (3.9), we have

$$\|\bar{B}_1 - F'(x^*)_1\|_F^2 \leq \|B_1 - F'(x^*)_1\|_F^2 - \frac{1}{\|s\|^2}\|(B_1 - F'(x^*)_1)s\|^2 + \theta^2 (\|\bar{x} - x^*\| + \|s\|)^2, \tag{4.2}$$

and

$$\|\bar{B}_2 - F'(x^*)_2\|_F^2 \leq \|B_2 - F'(x^*)_2\|_F^2 - \frac{1}{\|s\|^2}\|(B_2 - F'(x^*)_2)s\|^2 + \gamma^2 (\|\bar{x} - x^*\| + \|s\|)^2. \tag{4.3}$$

Then,

$$\begin{aligned} \|\bar{B} - F'(x^*)\|_F^2 &= \|\bar{B}_1 - F'(x^*)_1\|_F^2 + \|\bar{B}_2 - F'(x^*)_2\|_F^2 \\ &\leq \|B - F'(x^*)\|_F^2 + \alpha^2 (\|\bar{x} - x^*\| + \|s\|)^2 \\ &\leq \|B - F'(x^*)\|_F^2 + (3\alpha\sigma(x, \bar{x}))^2, \end{aligned}$$

where $\sigma(x, \bar{x}) = \max\{\|x - x^*\|, \|\bar{x} - x^*\|\}$. Therefore,

$$\|\bar{B} - F'(x^*)\|_F \leq \|B - F'(x^*)\|_F + 3\alpha\sigma(x, \bar{x}).$$

Thus, by Theorem 5.1 of Dennis and Moré [5], $\{x^k\}$ converges at least q-linearly to $x^*$.


By Theorem 3.1 of Dennis and Moré [5], to prove q-superlinear convergence, we need only to prove that

$$\lim_{k \to \infty} \frac{\|(B_k - F'(x^*))s^k\|}{\|s^k\|} = 0. \tag{4.4}$$

Let $E = B - F'(x^*)$ and $\bar{E} = \bar{B} - F'(x^*)$. Then, it follows from (4.2) and (4.3) that

$$\|\bar{E}_1\|_F \leq \left(\|E_1\|_F^2 - \frac{\|E_1 s\|^2}{\|s\|^2}\right)^{1/2} + 3\theta\sigma(x, \bar{x}), \tag{4.5}$$

and that

$$\|\bar{E}_2\|_F \leq \left(\|E_2\|_F^2 - \frac{\|E_2 s\|^2}{\|s\|^2}\right)^{1/2} + 3\gamma\sigma(x, \bar{x}). \tag{4.6}$$

From (4.5) and (4.6), using the same argument as for proving the q-superlinear convergence property of Broyden’s algorithm (see Dennis and Moré [5]), we obtain

$$\lim_{k \to \infty} \frac{\|(B_k - F'(x^*))_1 s^k\|}{\|s^k\|} = 0 \tag{4.7}$$

and

$$\lim_{k \to \infty} \frac{\|(B_k - F'(x^*))_2 s^k\|}{\|s^k\|} = 0. \tag{4.8}$$

Notice that

$$\|(B_k - F'(x^*))s^k\| \leq \|(B_k - F'(x^*))_1 s^k\| + \|(B_k - F'(x^*))_2 s^k\|.$$

Thus, (4.4) follows from (4.7) and (4.8).